Throughput-Distortion Computation Of Generic Matrix Multiplication: Toward A Computation Channel For Digital Signal Processing Systems


Abstract

The generic matrix multiply (GEMM) function is the core element of high-performance linear algebra libraries used in many computationally demanding digital signal processing (DSP) systems. We propose an acceleration technique for GEMM based on dynamically adjusting the imprecision (distortion) of computation. Our technique applies adaptive scalar companding and rounding to input matrix blocks, followed by two forms of packing in floating-point that allow for concurrent calculation of multiple results. Since the adaptive companding process controls the increase of concurrency (via packing), the increase in processing throughput (and the corresponding increase in distortion) depends on the input data statistics. To demonstrate this, we derive the optimal throughput-distortion control framework for GEMM for the broad class of zero-mean, independent identically distributed input sources. Our approach converts matrix multiplication in programmable processors into a computation channel: when increasing the processing throughput, the output noise (error) increases due to (i) coarser quantization and (ii) computational errors caused by exceeding the machine-precision limitations. We show that, under certain distortion in the GEMM computation, the proposed framework can significantly surpass 100% of the peak performance of a given processor. The practical benefits of our proposal are shown in a face recognition system and a multi-layer perceptron system trained for metadata learning from a large music feature database.
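The idea of rounding followed by packing can be illustrated with a small sketch. This is not the authors' scheme: the abstract does not specify the companding law or either of the two packing forms, so the example below assumes plain uniform rounding and a simple integer packing in which two quantized blocks share one double-precision value. The names `quantize`, `packed_matmul`, `int_matmul` and the `shift` parameter are hypothetical and chosen for illustration only.

```python
import random


def quantize(block, step):
    """Round each entry to the nearest multiple of `step`, kept as an integer index."""
    return [[round(x / step) for x in row] for row in block]


def int_matmul(a, b):
    """Exact integer reference product, used only to check the packed result."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def packed_matmul(a1, a2, b, shift=2 ** 20):
    """Compute a1 @ b and a2 @ b with a single floating-point product.

    Assumption: all inputs are integer-valued (already quantized).
    """
    inner = len(b)
    peak_a = max(abs(x) for blk in (a1, a2) for row in blk for x in row)
    peak_b = max(abs(x) for row in b for x in row)
    bound = inner * peak_a * peak_b  # worst-case magnitude of any output entry
    # Each result must fit in its own slot, and the packed value must stay
    # exactly representable in a double (53-bit significand).
    if 2 * bound >= shift or (bound + 1) * shift >= 2 ** 53:
        raise ValueError("inputs too large to pack without precision errors")

    # Pack: a1 occupies the low slot, a2 the high slot of each double.
    packed = [[float(x1 + shift * x2) for x1, x2 in zip(r1, r2)]
              for r1, r2 in zip(a1, a2)]
    # One float product stands in for the GEMM call.
    c = [[sum(packed[i][t] * float(b[t][j]) for t in range(inner))
          for j in range(len(b[0]))] for i in range(len(packed))]

    # Unpack: the high slot is the nearest multiple of `shift`,
    # the low slot is the remainder.
    high = [[round(v / shift) for v in row] for row in c]
    low = [[int(v - shift * h) for v, h in zip(rv, rh)]
           for rv, rh in zip(c, high)]
    return low, high


random.seed(0)


def draw(rows, cols):
    """Zero-mean i.i.d. Gaussian test block."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


a1 = quantize(draw(3, 4), 0.25)
a2 = quantize(draw(3, 4), 0.25)
b = quantize(draw(4, 2), 0.25)
c1, c2 = packed_matmul(a1, a2, b)
print(c1 == int_matmul(a1, b), c2 == int_matmul(a2, b))
```

In this sketch a larger `step` gives smaller integers, which leaves room to pack more results per value at the cost of coarser quantization. The guard that raises `ValueError` marks the point where exceeding machine precision would start to corrupt the outputs, which corresponds to the second noise source named in the abstract.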
